Welcome to my first article entry. Here I will give a step-by-step explanation of my implementation of a QR code generator.


General

Before actually starting the implementation, let's think about the general steps a QR code generator needs to perform. Usually, the data is a URL or, more generally, a sequence of characters. To transform them into a black-and-white pattern, we need to encode them somehow. This is the first step. After this, we need to think about error correction. This is arguably the most important step: real-world QR codes are rarely pristine and undamaged, so some white or black squares are usually flipped or simply not visible. Our initially encoded data bits therefore need to be extended with some redundancy that helps to recover from errors. Lastly, we need to lay out all the generated bits in a square matrix in order to produce the actual QR code. A rough sketch of the different steps therefore looks like this:

  1. Encode
  2. Error Correction
  3. Layout

Of course, there is a lot more to it than this simple explanation. But I'll leave the details to the respective sections below, to keep the discussion vivid rather than dry. So let's start. We define two classes to keep everything clean, QRCodeGenerator and QRCodeUtility:

/*
Class for generating a QR code.
*/
class QRCodeGenerator {}

/*
    Utility class used (mainly) for the encoding part.
*/
class QRCodeUtility {}

Encoding

Analyze input

The QR code specification divides the data to be encoded into different classes (the specification calls them modes). Depending on what you want to encode, a different class is used. The reason is that a more limited alphabet needs fewer bits per character, so more data fits into the same QR code. Specifically, there are four different classes:

  1. Numeric

    Allows the encoding of the digits 0 - 9 but nothing else.

  2. Alphanumeric

    Allows the encoding of the digits 0 - 9, the uppercase letters A B C D E F G H I J K L M N O P Q R S T U V W X Y Z, the symbols $ % * + - . / : and additionally the space character [BLANK]. As you can see, no lowercase letters are included. This means that for the string "Hello World" you need to use the byte encoding mode.

  3. Bytes

    Encodes arbitrary bytes.

  4. Kanji

    Encodes Japanese characters (Kanji).
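
To make the four classes a bit more tangible, here is a minimal sketch of how the class of an input string could be determined. The names ALPHANUMERIC_CHARS and detectMode are my own choices for illustration, and Kanji detection is deliberately left out, so anything that is neither numeric nor alphanumeric simply falls back to bytes.

```javascript
// The 45 characters allowed in alphanumeric mode, in the order of their values 0..44.
const ALPHANUMERIC_CHARS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:";

// Hypothetical helper: returns the most compact class that can hold `text`.
// Kanji detection is not part of this sketch.
function detectMode(text) {
  // Only digits: the numeric class is enough.
  if (/^[0-9]+$/.test(text)) return "numeric";
  // Every character is in the limited alphabet: alphanumeric class.
  if ([...text].every((ch) => ALPHANUMERIC_CHARS.includes(ch))) return "alphanumeric";
  // Everything else (for example lowercase letters) needs the byte class.
  return "byte";
}

console.log(detectMode("12345"));       // numeric
console.log(detectMode("HELLO WORLD")); // alphanumeric
console.log(detectMode("Hello World")); // byte
```

Checking the classes from the most restrictive to the least restrictive means the first match is also the most compact one.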

[Interactive input field: enter a string to see to which class its characters belong (table columns: Character, Numeric, Alphanumeric). This input is also used in the following sections.]

From characters to bytes

Now that we know which class should be used for encoding, we need to actually encode the data. Different classes use different encoding schemes, and covering all of them would make the explanation quite boring and lengthy.

[Interactive table with the columns: Token, Value, Encoded (Padded)]
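
As an illustration of one such scheme, here is a small sketch for the alphanumeric class only. In this class the input is split into tokens of two characters; each pair is turned into the value 45 * first + second and written as 11 bits, while a single leftover character is written as 6 bits. The function name encodeAlphanumeric is a hypothetical one chosen for this example.

```javascript
// The 45 characters allowed in alphanumeric mode; the index is the character's value.
const ALPHANUMERIC_CHARS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:";

// Hypothetical helper: looks up the value (0..44) of a single character.
function alphanumericValue(ch) {
  const value = ALPHANUMERIC_CHARS.indexOf(ch);
  if (value < 0) throw new Error(`Not an alphanumeric character: ${ch}`);
  return value;
}

// Hypothetical helper: returns one zero-padded bit string per token.
function encodeAlphanumeric(text) {
  const chunks = [];
  for (let i = 0; i < text.length; i += 2) {
    const first = alphanumericValue(text[i]);
    if (i + 1 < text.length) {
      // Full pair: 45 * first + second, padded to 11 bits.
      const second = alphanumericValue(text[i + 1]);
      chunks.push((first * 45 + second).toString(2).padStart(11, "0"));
    } else {
      // Single leftover character: padded to 6 bits.
      chunks.push(first.toString(2).padStart(6, "0"));
    }
  }
  return chunks;
}

console.log(encodeAlphanumeric("HELLO WORLD"));
```

For example, the token "HE" has the values 17 and 14, which gives 17 * 45 + 14 = 779, or 01100001011 when padded to 11 bits.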

Padding and length information

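Next, the length information and the padding bits are added to the bit string: the length information goes in front of the data bits, the padding bits go at the end. This yields the following: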
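
A minimal sketch of this step could look as follows. It rests on a few assumptions that are not spelled out above: the alphanumeric class (mode indicator 0010), a small QR code of version 1 to 9 (where the character count takes 9 bits in this class), and a capacity passed in by the caller, for example 16 data bytes for version 1 with error correction level M. The function name addLengthAndPadding is hypothetical.

```javascript
// The two pad bytes (0xEC and 0x11) that are repeated alternately until the capacity is reached.
const PAD_BYTES = ["11101100", "00010001"];

// Hypothetical helper, assuming alphanumeric mode and a version 1..9 code.
// dataBits: encoded characters as a bit string, charCount: number of input
// characters, capacityBytes: number of data bytes the chosen QR code can hold.
function addLengthAndPadding(dataBits, charCount, capacityBytes) {
  const capacityBits = capacityBytes * 8;
  // Mode indicator (4 bits) and character count (9 bits) go in front of the data.
  let bits = "0010" + charCount.toString(2).padStart(9, "0") + dataBits;
  if (bits.length > capacityBits) throw new Error("Data does not fit into the capacity");
  // Terminator: up to four zero bits, fewer if the capacity is reached earlier.
  bits += "0".repeat(Math.min(4, capacityBits - bits.length));
  // Fill up with zero bits to the next byte boundary.
  bits += "0".repeat((8 - (bits.length % 8)) % 8);
  // Fill the remaining capacity with the alternating pad bytes.
  for (let i = 0; bits.length < capacityBits; i++) bits += PAD_BYTES[i % 2];
  return bits;
}

// The 61 data bits of "HELLO WORLD" in alphanumeric mode.
const dataBits = [
  "01100001011", "01111000110", "10001011100",
  "10110111000", "10011010100", "001101",
].join("");
console.log(addLengthAndPadding(dataBits, 11, 16));
```

The capacity is a parameter here because it depends on the version and error correction level, which this sketch does not choose on its own.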

Now that we have our data encoded, we need to convert the bits to bytes in order to add error correction to them.
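
The conversion itself is simple. As a sketch, assuming the bits are kept in a string of "0" and "1" characters whose length is a multiple of 8 (the helper name bitsToBytes is hypothetical):

```javascript
// Hypothetical helper: splits a bit string into groups of 8 and parses each group as a byte.
function bitsToBytes(bits) {
  if (bits.length % 8 !== 0) throw new Error("Bit string length must be a multiple of 8");
  const bytes = [];
  for (let i = 0; i < bits.length; i += 8) {
    bytes.push(parseInt(bits.slice(i, i + 8), 2));
  }
  return bytes;
}

console.log(bitsToBytes("0010000001011011")); // [32, 91]
```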

Now we have the data bytes with the additional error correction terms.
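
QR codes use Reed-Solomon codes for this step. The following is a sketch of how the error correction bytes could be computed, not necessarily how my classes do it: arithmetic is done in the finite field GF(256) with the primitive polynomial 0x11D, and the error correction bytes are the remainder of dividing the data polynomial by a generator polynomial. All function names are hypothetical, and the example data assumes "HELLO WORLD" in a version 1 code with level M (16 data bytes, 10 error correction bytes).

```javascript
// Exponent and logarithm tables for GF(256) with primitive polynomial 0x11D.
const EXP = new Array(512);
const LOG = new Array(256);
let power = 1;
for (let i = 0; i < 255; i++) {
  EXP[i] = power;
  LOG[power] = i;
  power <<= 1;
  if (power & 0x100) power ^= 0x11d;
}
// Repeat the table so that LOG[a] + LOG[b] never needs a modulo.
for (let i = 255; i < 512; i++) EXP[i] = EXP[i - 255];

// Multiplication in GF(256); addition in this field is plain XOR.
function gfMul(a, b) {
  return a === 0 || b === 0 ? 0 : EXP[LOG[a] + LOG[b]];
}

// Generator polynomial (x - a^0)(x - a^1)...(x - a^(eccCount-1)),
// stored with the highest-degree coefficient first.
function generatorPoly(eccCount) {
  let poly = [1];
  for (let i = 0; i < eccCount; i++) {
    const next = new Array(poly.length + 1).fill(0);
    for (let j = 0; j < poly.length; j++) {
      next[j] ^= poly[j];                    // contribution of poly[j] * x
      next[j + 1] ^= gfMul(poly[j], EXP[i]); // contribution of poly[j] * a^i
    }
    poly = next;
  }
  return poly;
}

// Polynomial long division: the remainder holds the error correction bytes.
function errorCorrectionBytes(dataBytes, eccCount) {
  const gen = generatorPoly(eccCount);
  const rem = dataBytes.concat(new Array(eccCount).fill(0));
  for (let i = 0; i < dataBytes.length; i++) {
    const factor = rem[i];
    if (factor === 0) continue;
    for (let j = 0; j < gen.length; j++) rem[i + j] ^= gfMul(gen[j], factor);
  }
  return rem.slice(dataBytes.length);
}

// Data bytes of "HELLO WORLD" (alphanumeric, version 1, level M; an assumed example).
const data = [32, 91, 11, 120, 209, 114, 220, 77, 67, 64, 236, 17, 236, 17, 236, 17];
console.log(errorCorrectionBytes(data, 10));
```

A handy sanity check for such an implementation: the data bytes followed by the error correction bytes, read as one polynomial, must evaluate to zero at every root of the generator polynomial.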